Abstract


An overview of the methods of line-by-line and whole-band analysis developed 
by The Ohio State University spectroscopy group is presented. 

Introduction

One of the objectives of our group is the collection and analysis of laboratory
spectra of molecules of atmospheric interest. In recent years we have developed efficient
techniques for extracting the maximum amount of information from these spectra with a
minimum of observer bias. The approach to the analysis of single lines and entire bands,
taken by the members of our group listed in Table 1, is described here.
More detailed descriptions are given elsewhere (see references).


Single Line Analysis 

A typical problem in spectral analysis is the determination of line positions. One
method is to estimate the line center and to determine its position with respect to other
calibration features by interpolation, as indicated in Fig. 1. This method is often
adequate but it is tedious if there are many lines to be measured. It is also difficult to
estimate the precision of the results since the judgments used to determine the line
center and in the interpolation process cannot be quantified.

Less reliance is placed on the observer if the experimental signals
$I_{\mathrm{exp}}(\nu_i)$ are compared with the corresponding signals
$I_{\mathrm{art}}(\nu_i)$ of an artificial line centered at $\nu_o(\mathrm{art})$, as
shown in Fig. 2a. The spectrum of the differences

    $$\Delta I(\nu_i) = I_{\mathrm{exp}}(\nu_i) - I_{\mathrm{art}}(\nu_i),$$

shown in Fig. 2b, indicates the sensitivity of the data to changes in the line position. This
sensitivity can also be tested by calculating the partial derivative
$\partial I_{\mathrm{exp}}(\nu_i)/\partial \nu_i$. A best estimate of the line position
$\nu_o$ is obtained by shifting the position $\nu_o(\mathrm{art})$ until the sum of
the squares of the differences

    $$SS = \sum_i \Delta I^2(\nu_i)$$

is a minimum, as indicated in Fig. 2c. The rate of change of the SS with respect to the
difference $\Delta\nu = \nu_o(\mathrm{exp}) - \nu_o(\mathrm{art})$ gives a measure of the
precision with which the line center can be found. Unlike the first method, once the model
for the artificial line and the criteria for establishing the uncertainty of the line
position have been defined, no other observer judgment is required.
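
The shifting procedure described above can be sketched as follows. This is only a minimal illustration: it assumes a noise-free Lorentz line with no instrument function, a known intensity and width, and a simple grid of trial centers. The function names and all numerical values are hypothetical and are not taken from the group's analysis programs.

```python
import math

def lorentz(nu, nu0, strength, alpha):
    """Lorentz profile with integrated intensity `strength` and half-width `alpha`."""
    return strength * alpha / (math.pi * ((nu - nu0) ** 2 + alpha ** 2))

def sum_of_squares(nu_grid, i_exp, nu0_art, strength, alpha):
    """SS = sum_i [I_exp(nu_i) - I_art(nu_i)]^2 for one trial line center."""
    return sum((obs - lorentz(nu, nu0_art, strength, alpha)) ** 2
               for nu, obs in zip(nu_grid, i_exp))

def best_line_center(nu_grid, i_exp, trial_centers, strength, alpha):
    """Return the trial center giving the smallest SS, together with that SS."""
    scored = [(sum_of_squares(nu_grid, i_exp, c, strength, alpha), c)
              for c in trial_centers]
    ss_min, center = min(scored)
    return center, ss_min

# Synthetic "experimental" line centered at 2614.25 cm^-1 (hypothetical values).
nu_grid = [2614.0 + 0.01 * k for k in range(51)]
i_exp = [lorentz(nu, 2614.25, 1.0, 0.05) for nu in nu_grid]

# Shift the artificial line in steps of 0.001 cm^-1 and keep the minimum of SS.
trial_centers = [2614.20 + 0.001 * k for k in range(101)]
center, ss_min = best_line_center(nu_grid, i_exp, trial_centers, 1.0, 0.05)
```

With noisy data the curvature of SS about its minimum, rather than the grid spacing, would set the precision of the retrieved center.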


Other parameters can also be determined by this method. By generating
artificial lines $I(\nu_i, S_{\mathrm{art}})$ as shown in Fig. 2d, corresponding to lines
of different intensities, and minimizing

    $$SS = \sum_i \Delta I^2(\nu_i, S_{\mathrm{art}}),$$

where

    $$\Delta I(\nu_i, S_{\mathrm{art}}) = I_{\mathrm{exp}}(\nu_i) - I(\nu_i, S_{\mathrm{art}}),$$

estimates of the line intensity and its precision can be obtained. In this method


(1) The intensity and position estimates, and also estimates of other variables $p_j$
which affect the signals $I(\nu_i)$, are obtained from the same set of experimental
data.

(2) Estimates of the goodness of fit to $I_{\mathrm{exp}}(\nu_i)$ are obtained from the
sums SS. These minima should not be significantly larger than those due to the noise
in the data.

(3) Good fits are only obtained if the models are accurate. The accuracy of the
models as well as the parameter estimates $p_j$ can be judged by examination of the
residuals spectra $\Delta I(\nu_i)$.

(4) All the information about each model and its parameters contained in the spectra
can be extracted since all the data are examined.

(5) The ultimate precision is dependent on the characteristics of the experimental
data. Some of these characteristics are described by the partial derivatives
$\partial I_{\mathrm{exp}}(\nu_i)/\partial p_j$.

(6) All of the parameters can be estimated simultaneously by using appropriate
least squares regression techniques.
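
Point (6) can be illustrated with a small Gauss-Newton iteration that retrieves the line position and intensity together. This is a sketch under stated assumptions, not the regression program used by the group: the Lorentz model, the fixed half-width, the starting values, and the function names are all hypothetical, and only two of the parameters are adjusted.

```python
import math

def lorentz(nu, nu0, strength, alpha=0.05):
    """Lorentz profile; the half-width alpha is held fixed in this sketch."""
    return strength * alpha / (math.pi * ((nu - nu0) ** 2 + alpha ** 2))

def gauss_newton(nu_grid, i_exp, nu0, strength, n_iter=20, h=1e-6):
    """Adjust (nu0, strength) simultaneously to minimize the sum of squares."""
    for _ in range(n_iter):
        # Residuals Delta I(nu_i) for the current parameter estimates.
        resid = [obs - lorentz(nu, nu0, strength)
                 for nu, obs in zip(nu_grid, i_exp)]
        # Partial derivatives: numerical for nu0, exact for the intensity
        # because the model is linear in it.
        d_nu0 = [(lorentz(nu, nu0 + h, strength)
                  - lorentz(nu, nu0 - h, strength)) / (2 * h) for nu in nu_grid]
        d_s = [lorentz(nu, nu0, 1.0) for nu in nu_grid]
        # 2 x 2 normal equations A * delta = b, solved by Cramer's rule.
        a11 = sum(d * d for d in d_nu0)
        a12 = sum(x * y for x, y in zip(d_nu0, d_s))
        a22 = sum(d * d for d in d_s)
        b1 = sum(d * r for d, r in zip(d_nu0, resid))
        b2 = sum(d * r for d, r in zip(d_s, resid))
        det = a11 * a22 - a12 * a12
        nu0 += (b1 * a22 - b2 * a12) / det
        strength += (a11 * b2 - a12 * b1) / det
    return nu0, strength

# Synthetic data: true center 2614.25 cm^-1, true intensity 1.0 (hypothetical).
nu_grid = [2614.0 + 0.01 * k for k in range(51)]
i_exp = [lorentz(nu, 2614.25, 1.0) for nu in nu_grid]
nu0_fit, s_fit = gauss_newton(nu_grid, i_exp, nu0=2614.24, strength=0.8)
```

The same normal-equation structure extends to more parameters, at which point a general linear solver replaces the explicit 2 x 2 solution.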

Niple (8,9) and Tu (10) have considered some of the limitations of the 
experimental data by analyzing simulated spectra of single lines. They have shown that 
the precisions of the retrieved parameters depend on 

(a) the SNR of the spectrum, 

(b) the number N of data points analyzed, 

o 

(c) the spacing $\delta\nu$ of the data points,

(d) the number of models required and parameters retrieved, 

(e) the accuracy of the models, 

(f) the magnitudes and shapes of the partial derivatives $\partial I/\partial p_j$, and

(g) the correlations between the parameters $p_j$.
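
The dependence on the noise level, the sampling, and the size of the partial derivatives can be seen in the simplest case of a single adjustable parameter, for which the least squares uncertainty is the noise standard deviation divided by the root of the summed squared derivatives. The sketch below applies this to the line center of a Lorentz line. It ignores the instrument function and all correlations, and the function names and values are hypothetical.

```python
import math

def d_lorentz_d_center(nu, nu0, strength, alpha):
    """Analytic derivative of a Lorentz line with respect to its center nu0."""
    x = nu - nu0
    return strength * alpha * 2.0 * x / (math.pi * (x * x + alpha * alpha) ** 2)

def center_uncertainty(nu_grid, nu0, strength, alpha, noise_sigma):
    """One-parameter case: sigma_p = sigma_noise / sqrt(sum_i (dI/dp)^2)."""
    info = sum(d_lorentz_d_center(nu, nu0, strength, alpha) ** 2
               for nu in nu_grid)
    return noise_sigma / math.sqrt(info)

def grid(start, stop, n_points):
    """Evenly spaced wavenumber offsets from the line center."""
    step = (stop - start) / (n_points - 1)
    return [start + k * step for k in range(n_points)]

coarse = grid(-0.25, 0.25, 51)    # spacing 0.01 cm^-1
dense = grid(-0.25, 0.25, 101)    # spacing 0.005 cm^-1

u_coarse = center_uncertainty(coarse, 0.0, 1.0, 0.05, noise_sigma=0.01)
u_dense = center_uncertainty(dense, 0.0, 1.0, 0.05, noise_sigma=0.01)
u_noisy = center_uncertainty(coarse, 0.0, 1.0, 0.05, noise_sigma=0.02)
```

Halving the SNR doubles the uncertainty, while finer sampling of the same interval reduces it. When several parameters are retrieved the correlations in item (g) enlarge these single-parameter values.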

It is necessary to use a minimum of four models containing at least six adjustable
parameters to create a set of simulated signals $I_{\mathrm{exp}}(\nu_i)$ suitable for
analysis. These are briefly described in Table 2. Tu (10) has investigated the information
content in spectra similar to those in Fig. 3, calculated for a Lorentz line, a triangular
ISRF with $H = \alpha$, a background and baseline independent of $\nu$, and a SNR of
$10^4$. He has estimated the dependence of the fractional uncertainty $\Delta S/S$ in the
retrieved line intensity on $S$, as shown in Fig. 4 for the case where all six parameters
are retrieved. Because of increasing correlations between some of the parameters there is a
limited range of $S$ values for which satisfactory retrievals can be made. These
correlations are related to the derivatives $\partial I(\nu)/\partial p$ shown in Fig. 5.

Some improvement in these fractional uncertainties is obtained by fixing one or
more of the parameters, although this may introduce systematic errors if these
parameters are fixed to incorrect values. It is seen from Fig. 4 that, at small S values,
the greatest improvement is obtained by fixing the baseline.

From these studies quantitative evaluations of experimental designs can be
made. These have yielded some interesting results in addition to confirming the intuitive
conditions obtained by less rigorous analyses of experimental spectra. In particular

(1) The usable range can be determined. This is the range of experimental 
conditions for which all six parameters of these simple models can be estimated. 

(2) For the cases considered here this range is primarily limited by correlations
between $S$, $B$, $\alpha$, and $H$.

(3) The usable range can be extended by fixing one or more of the adjustable 
parameters. 

(4) The systematic errors introduced by fixing the parameters may be larger than 
the uncertainties of the parameters retrieved by the regression methods used 
here. This is especially likely if the fixed parameter was highly correlated with
a free parameter.
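
The link between the shapes of the partial derivatives and the correlations between retrieved parameters can be illustrated with a pairwise calculation. When only two parameters are fitted, the correlation coefficient of their estimates is minus the normalized inner product of the two derivative spectra. The sketch assumes a hypothetical unconvolved signal model (baseline plus background times the transmittance of a Lorentz line), so it omits the ISRF. The parameter names and values are illustrative only, and a full six-parameter retrieval would require inverting the complete normal matrix.

```python
import math

def lorentz(nu, nu0, alpha):
    """Area-normalized Lorentz shape with half-width alpha."""
    return alpha / (math.pi * ((nu - nu0) ** 2 + alpha ** 2))

def signal(nu, p):
    """Hypothetical model: baseline B plus background I_o times transmittance."""
    return p["B"] + p["I_o"] * math.exp(-p["S"] * lorentz(nu, p["nu_o"], p["alpha"]))

def derivative(nu_grid, p, name, rel_step=1e-6):
    """Central-difference estimate of dI/dp over the grid for one parameter."""
    h = rel_step * max(abs(p[name]), 1.0)
    up = dict(p, **{name: p[name] + h})
    down = dict(p, **{name: p[name] - h})
    return [(signal(nu, up) - signal(nu, down)) / (2 * h) for nu in nu_grid]

def pair_correlation(d1, d2):
    """Correlation of two estimates when only those two parameters are fitted."""
    a11 = sum(x * x for x in d1)
    a22 = sum(y * y for y in d2)
    a12 = sum(x * y for x, y in zip(d1, d2))
    return -a12 / math.sqrt(a11 * a22)

# Wavenumber offsets from the line center, in cm^-1 (hypothetical grid).
nu_grid = [-0.25 + 0.01 * k for k in range(51)]
weak = {"S": 0.01, "alpha": 0.05, "nu_o": 0.0, "I_o": 1.0, "B": 0.0}

# For a weak line the baseline and background derivatives are nearly identical
# in shape, so the two estimates are almost completely anti-correlated.
corr_b_io = pair_correlation(derivative(nu_grid, weak, "B"),
                             derivative(nu_grid, weak, "I_o"))
corr_s_alpha = pair_correlation(derivative(nu_grid, weak, "S"),
                                derivative(nu_grid, weak, "alpha"))
```

A magnitude near one for a pair of parameters indicates that the data cannot separate them well, which is the situation in which fixing one of them extends the usable range.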


Whole Band Analysis 

For many atmospheric problems it is sufficient to know the intensity, width, and
position of a few lines. However, these data also serve as the raw material from which
more fundamental molecular properties can be derived, provided a sufficiently large
number of lines in a vibration-rotation band can be analyzed.

The strongest band of CO$_2$ shown in Fig. 6 contains over 100 lines. If a
line-by-line analysis on this band were performed by the methods just described, up to 600
parameter values would be required. However, the spectrum in Fig. 6 contains three CO$_2$
bands with lines of measurable intensity and thus more than 1000 parameters may be
required to describe this spectrum. Many of these lines lie outside the usable range
described above and most are overlapped and blended with other lines. Unless the
investigator makes arbitrary decisions concerning some of the adjustable parameters in
these line-by-line models, only a small fraction of the lines can be analyzed satisfactorily.

However, the parameters describing these lines are not independent and the
models for the background and baseline can be assumed to be smoothly varying functions
over the band. Thus the spectrum in Fig. 6 can be described by models similar to those in
Table 2 and a series of additional models which relate the parameters associated with the
individual lines (5, 6). The models describing these additional relationships and the typical
numbers of adjustable parameters in these models are shown in Table 3. By extending the
analysis of the previous section to the retrieval of the parameters in this table, the
problem is reduced to estimating of the order of 40 parameters rather than over 1000. We
have used this technique to analyze overlapping bands of CO, CO$_2$, and N$_2$O which
contain over 3000 data values and several hundred lines (see Refs. 3-7). The advantages of
this approach are

1. the parameters of primary interest are found directly from the experimental
data,

2. the number of parameters retrieved is relatively small and this allows their
precision to be increased,

3. the usable range is larger than that for single lines, because many of the highly
correlated parameters previously measured are no longer estimated directly, and

4. the accuracy of the band models can be rigorously tested.

When the band systems in Fig. 6 were analyzed (5) by this method the residuals
spectrum shown in the lower part of the figure was obtained. The weak Q branch near
2615 cm$^{-1}$ and associated P and R branch lines are just above the noise level and are
too weak to be modelled satisfactorily. Even so, the sum of the squares of the differences is
within 50% of that expected from the noise and this is considered to be a good retrieval.

This agreement confirms that the line shape, the band models, the ISRF, and
other instrumental effects have been modelled sufficiently well to describe the
experimental data. These data have an estimated SNR of 200-300 and were obtained with
a Fourier Transform Spectrometer (FTS) having a spectral resolution of about
0.06 cm$^{-1}$. The results of the analysis of this band system have been described by
Hoke (5, 6). The center of the principal band was estimated to occur at
2614.241224 ± 0.000055 cm$^{-1}$; this precision is at least an order of magnitude better
than is usually claimed from measurements made with instruments of similar resolution and
performance.

If the background, baseline, and ISRF parameters are retrieved, together with
the band intensity and line width parameters, then the line intensities and widths can be
estimated to about 2%. In all other reported measurements of similar quantities these
instrumental parameters have been assumed known. When our analyses were repeated
with these same assumptions the uncertainties in the line intensities and widths were
reduced to less than 0.1%.
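
The reduction in the number of parameters obtained with band models such as those in Table 3 can be illustrated for the line positions alone. The sketch below assumes the standard term-value expression for a linear molecule, G + B J(J+1) - D [J(J+1)]^2, for both the upper and lower states. The constants are hypothetical values chosen only to place the band near 2614 cm^-1 and are not retrieved results; nuclear-spin restrictions on the allowed J values are ignored.

```python
def term_value(j, origin, b_rot, d_cent):
    """Vibration-rotation energy G + B J(J+1) - D [J(J+1)]^2 in cm^-1."""
    x = j * (j + 1)
    return origin + b_rot * x - d_cent * x * x

def band_line_positions(upper, lower, j_max):
    """Positions nu = E' - E'' for the P(1..j_max) and R(0..j_max) lines."""
    lines = {}
    for j in range(0, j_max + 1):          # j is the lower-state quantum number
        e_low = term_value(j, *lower)
        if j >= 1:
            lines[f"P({j})"] = term_value(j - 1, *upper) - e_low
        lines[f"R({j})"] = term_value(j + 1, *upper) - e_low
    return lines

# Hypothetical (origin, B, D) constants in cm^-1, for illustration only.
upper = (2614.25, 0.3650, 1.2e-7)
lower = (0.0, 0.3680, 1.2e-7)

lines = band_line_positions(upper, lower, j_max=50)
n_lines = len(lines)
n_parameters = len(upper) + len(lower)
```

Here about one hundred line positions follow from six constants, so every measured line contributes to the estimate of each constant instead of being fitted on its own.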


Portions of the experimental spectrum, the corresponding calculated spectrum,
and the residuals spectrum obtained during the course of this investigation are shown in
Fig. 7. Although it is difficult to detect significant differences between the observed and
calculated spectra, the residuals spectrum shows systematic variations attributed to poor
modelling of the ISRF. These systematic differences are not seen in the final residuals
spectrum shown in Fig. 8.
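
Two simple numerical checks on a residuals spectrum correspond to the tests described here: the ratio of the sum of squares to the value expected from the noise, and a measure of whether neighboring residuals vary together, which indicates systematic structure such as that seen in Fig. 7. The sketch below is an assumption about how such checks might be coded, using the lag-one autocorrelation as the structure measure. It is not the procedure used in the original analysis, and the example residuals are artificial.

```python
import math

def ss_ratio(residuals, noise_sigma, n_params):
    """Sum of squares divided by its noise expectation, (N - n_params) * sigma^2."""
    dof = len(residuals) - n_params
    return sum(r * r for r in residuals) / (dof * noise_sigma ** 2)

def lag1_autocorrelation(residuals):
    """Near zero for uncorrelated residuals, near one for slowly varying ones."""
    mean = sum(residuals) / len(residuals)
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    return sum(a * b for a, b in zip(dev, dev[1:])) / denom

# Artificial residuals with a slow systematic wave (period of 50 data points).
systematic = [math.sin(2 * math.pi * k / 50) for k in range(100)]
# Artificial residuals that alternate in sign from point to point.
alternating = [1.0 if k % 2 == 0 else -1.0 for k in range(10)]

ratio = ss_ratio(alternating, noise_sigma=1.0, n_params=0)
rho_systematic = lag1_autocorrelation(systematic)
rho_alternating = lag1_autocorrelation(alternating)
```

A ratio close to one together with a small autocorrelation is consistent with residuals dominated by noise; a ratio within 50% of one was taken above as a good retrieval.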

In our analysis of CO, CO$_2$, and N$_2$O bands we have consistently obtained stable,
reproducible solutions, but the size of the data sets which can be handled and the number
of parameters retrieved are limited by the computer resources available. As these
resources expand it is expected that several spectra of the same band and/or several band
systems will be analyzable simultaneously. The band systems of most molecules are
described by more complicated models than those required in these investigations. In
these cases the whole band methods of analysis can be powerful tools for testing the
accuracies of these models and for identifying the significant parameters.

These analytical methods involve the application of standard statistical
techniques to the analysis of typical experimental data. However, they have advantages
over the more conventional methods of spectral analysis which are similar to the
advantages claimed for FTS over dispersive spectroscopic techniques. These are usually
described as the throughput and multiplex advantages.

By analogy with these comparisons the multiplex advantage of whole band
analysis over line-by-line analysis is the direct, simultaneous retrieval of the parameters
of interest. There is also a corresponding throughput advantage in that the entire data set
is searched for information about each parameter.


Table 2 Single Line Models and Parameters

1  Line shape (Lorentz, Voigt, ...)

2  Instrument spectral response function, ISRF (sinc, triangle, Gaussian, ...)

3  Background to line (polynomial in $\nu$ and other terms)

4  Baseline to line (polynomial in $\nu$)


Synthetic spectrum 






Table 3 Models for Describing Entire Bands

Quantity          Model                                        Number of Parameters

Line intensities  Band intensity,                              ~3 if interaction terms are
                  $S_B = \sum S(\mathrm{lines})$               included (the sample temperature
                                                               and [illegible] are also required)

Line positions    $\nu = E' - E''$. For CO$_2$ the upper and   <8
                  lower state energies can each be modelled
                  by expressions containing 3 or 4 parameters.

Line widths       The variation in the Lorentz widths over     ~10
                  the band is modelled by empirical relations.

ISRF              The models contain 3-4 parameters which      <4
                  are assumed constant over the band.

$I_o(\nu_i)$      Several parameters may be required if        <10
                  channel spectra are present.

$B(\nu_i)$        Usually 2 terms are adequate.                <3

Total number of parameters retrieved: 30 - 40.













Figure Captions 


Fig. 1. Line positions by interpolation. The position of the jth line is
$(2622 + \Delta x_j/\Delta x_1)$ cm$^{-1}$.

Fig. 2. a) The spectrum $I_{\mathrm{art}}(\nu_i)$ of an artificial line is compared with the
experimental data $I_{\mathrm{exp}}(\nu_i)$. b) The differences $\Delta I(\nu_i)$ are
similar in shape to the partial derivative $\partial I_{\mathrm{exp}}(\nu_i)/\partial \nu$.
c) The best estimate of the line position occurs at the minimum value of the sum
$\sum_i \Delta I^2(\nu_i)$. d), e), f) Corresponding steps required to determine the line
intensity.

Fig. 3. A set of lines generated by using a Lorentz line shape, a triangular ISRF with
$H = \alpha = 0.1$ cm$^{-1}$, and constant $I_o$ and $B$ functions.

Fig. 4. The dependence of the fractional uncertainty in the retrieved line intensity on
the line intensity for the case of 6 retrieved parameters and for the cases where
$B$, $I_o$, and $H$ are fixed to their correct values.

Fig. 5. The derivatives $\partial I(\nu_i)/\partial p_j$ as functions of $\nu$ for the
parameters $p_j = S, \alpha, \nu_o, H, I_o$ and $B$, and the dependence of these
derivatives on the line intensity $S$.

Fig. 6. Upper curve: a spectrum of CO$_2$ near 2600 cm$^{-1}$. Lower curve: the residuals
spectrum after modelling the entire spectrum and retrieving more than 30
adjustable parameters.


Fig. 7. The upper portion of this figure shows a comparison between a portion of the 
experimental spectrum in Fig. 6 and a calculated spectrum. The residuals 
spectrum in the lower part of this figure has systematic variations attributed to 
incorrect modelling. 

Fig. 8. The upper part of this figure is the same portion of the experimental spectrum as 
that in Fig. 7. The lower part of this figure is the residuals spectrum obtained by 
using the best estimates for the parameters. 











